Conclusions of the 7th Crystal Structure Prediction Blind Test
The Cambridge Crystallographic Data Centre (CCDC) has recently closed the 7th Crystal Structure Prediction (CSP) Blind Test.
CSP is the prediction, from a molecule's 2-D structure, of the 3-D crystal structure(s) that the molecule will form. It combines informatics and computational science with intensive computational resources.
If 3-D crystal structures could be accurately and consistently predicted from 2-D drawings, stability risks could be predicted before costly experimental trials. This would be highly advantageous when developing pharmaceuticals and other solid form materials.
The search for the experimentally observed crystal structure begins with building a molecular model from a 2-D representation of a molecule. Advanced search techniques are then used to generate plausible crystal structures, which can be visualized and ranked on energy and density.
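The ranking step described above can be illustrated with a minimal Python sketch. It is not the method of any Blind Test participant: the `Candidate` class, the field names, the tie-breaking rule (denser structure first when energies are equal) and all numerical values are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical predicted crystal structure (illustrative fields only)."""
    label: str
    lattice_energy_kj_mol: float  # more negative = more stable
    density_g_cm3: float


def rank_candidates(candidates):
    """Sort by lattice energy (lowest first); break ties by higher density."""
    return sorted(
        candidates,
        key=lambda c: (c.lattice_energy_kj_mol, -c.density_g_cm3),
    )


def relative_energies(ranked):
    """Return (label, energy above the lowest-energy candidate) pairs."""
    lowest = ranked[0].lattice_energy_kj_mol
    return [(c.label, round(c.lattice_energy_kj_mol - lowest, 2)) for c in ranked]


# Invented example landscape; real landscapes contain many more structures.
candidates = [
    Candidate("A", -120.5, 1.42),
    Candidate("B", -123.0, 1.38),
    Candidate("C", -123.0, 1.45),
    Candidate("D", -118.2, 1.50),
]

ranked = rank_candidates(candidates)
for label, delta in relative_energies(ranked):
    print(f"{label}: +{delta} kJ/mol above the lowest-energy structure")
```

A ranked list of this kind is only a starting point for comparison with experiment.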
However, determining which crystal structure(s) occur experimentally is not that simple. Many factors present challenges for CSP, including stable and metastable polymorphs, packing complications, and the fact that the lowest-energy structure is not always the one observed experimentally. The CCDC's Blind Tests, which began in 1999, bring together scientists in this field to advance methods and overcome these and other challenges.
In the 7th CSP Blind Test, which ran from October 2020 to September 2022, seven 2-D structures of systems that had been experimentally analysed but not published ('blind') were released to participants along with some experimental conditions.
The seven 2-D structures included Cu- and Si-containing systems, which pushed the boundaries of CSP beyond the pharmaceutical sector to areas such as electronics and photonics.
Other structures included one of the most challenging systems in CSP Blind Test history—a large, highly polymorphic, pharmaceutical drug candidate—along with agrochemicals and a food flavouring.
More experimental data was released over the course of the test to simulate real world conditions.
Participants included groups from both industry and academia. For every compound, at least one group predicted the experimentally observed crystal structure from landscapes of more than 1,000 potential structures.
In one challenge, a powder X-ray diffraction (PXRD) pattern for an observed crystal structure was provided alongside the 2-D chemical structure. This emulated a common situation in which a single-crystal structure cannot be obtained but a poor-quality powder pattern can, and it showcased an application of CSP to industrially relevant molecules.
CSP is an ever-evolving discipline, and the 7th Blind Test identified several future challenges. One is disorder prediction, which had previously been disregarded owing to its complexity but has become more industrially relevant as bigger molecules with more flexible groups appear in drug development. Others are challenges that better reflect industrial reality and a continued broadening beyond pharmaceuticals to solid-state devices.
"Thanks to collaborative initiatives like the CCDC blind test, the complex field of CSP is advancing rapidly, taking advantage of machine learning techniques and the availability of ever-more powerful computing resource," says Dr. Jürgen Harter, CEO, CCDC. "I look forward to future Blind Tests addressing challenges such as structure ranking, overprediction and disorder."
A scientific paper with full results is being prepared and will be submitted for publication in 2023. Preliminary detailed results, including how many groups correctly predicted each target, the lowest rankings, CPU time used and details of methodologies, are available by emailing hello@ccdc.cam.ac.uk.
Provided by CCDC—Cambridge Crystallographic Data Centre